Development and validation of a nomogram for evaluating the incident risk of carotid atherosclerosis in patients with type 2 diabetes

Introduction The purpose of this study was to evaluate the clinical characteristics of carotid atherosclerotic disease in patients with type 2 diabetes mellitus, investigate its risk factors, and develop and validate an easy-to-use nomogram. Methods A total of 1049 patients diagnosed with type 2 diabetes were enrolled and randomly assigned to the training and validation cohorts. Multivariate logistic regression analysis identified independent risk factors. Least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation was used to screen for characteristic variables associated with carotid atherosclerosis. A nomogram was used to visually display the risk prediction model. Nomogram performance was evaluated using the C index, the area under the receiver operating characteristic curve, and calibration curves. Clinical utility was assessed by decision curve analysis. Results Age, nonalcoholic fatty liver disease, and OGTT3H (plasma glucose 3 hours after the oral glucose load) were independent risk factors associated with carotid atherosclerosis in patients with diabetes. Age, nonalcoholic fatty liver disease, smoking, HDL-C, and LDL-C were the characteristic variables used to develop the nomogram. The area under the curve of the nomogram was 0.763 for the training cohort and 0.717 for the validation cohort. The calibration curves showed that the predicted probabilities agreed well with the observed probabilities. Decision curve analysis indicated that the nomogram was clinically useful. Discussion A new nomogram was developed and validated for assessing the incident risk of carotid atherosclerosis in patients with type 2 diabetes; this nomogram may serve as a clinical tool to assist clinicians in making treatment recommendations.


Introduction
Globally, diabetes mellitus (DM), a group of diseases defined by chronic hyperglycemia, is becoming more prevalent. According to the International Diabetes Federation, 537 million people aged 20 to 79 were living with diabetes worldwide in 2021 (1). Diabetes causes many complications, and microvascular and macrovascular complications account for a significant portion of the cost of treating type 2 diabetes mellitus (T2DM) (2). Based on a follow-up of deaths in 10 study centers worldwide, cardiovascular disease is the leading cause of death in diabetic patients, accounting for 44% of deaths in patients with type 1 diabetes and 52% of deaths in patients with T2DM (3).
Atherosclerosis is the main pathological cause of macroangiopathy in diabetic patients and is an independent risk factor for cardiovascular disease. Moreover, patients with atherosclerosis secondary to DM are particularly susceptible to cardiovascular and cerebrovascular diseases (CCVD) and other vascular diseases, which significantly reduce their quality of life (4)(5)(6). As living standards improve, CCVD events are occurring at younger ages, and the associated mortality and disability are increasing (7). Therefore, to reduce cardiovascular and cerebrovascular diseases and their complications, early identification of patients with atherosclerotic cardiovascular disease and effective interventions are important (7). The carotid arteries are among the earliest vessels involved in atherosclerotic lesions (8). Carotid atherosclerosis disease (CAD) is associated with the severity of ischemic cardiovascular disease and also acts as a window into the status of atherosclerosis in other vessels of the body (9, 10). Guoqing Huang et al. established CAD risk prediction models for the general population based on age, body mass index (BMI), diastolic blood pressure (DBP), DM, alanine aminotransferase (ALT), aspartate aminotransferase (AST), and gamma-glutamyl transpeptidase (GGT) (11). Fengqi Guo et al. showed that the occurrence of CAD in the T2DM population is closely related to gender, age, hypertension, lipids, diabetic retinopathy, and other factors (12). Our search of the existing literature did not identify a clinical risk prediction model developed specifically for this population. Therefore, in this study, we aimed to create a practical nomogram for predicting the risk of CAD in adults with T2DM. The nomogram is a common visual presentation tool for disease risk prediction models that is simple and easy to use (13). In addition, a clear understanding of the important risk factors for CAD will provide a foundation for preventing and delaying serious complications.

Study population
In this retrospective cross-sectional study, we enrolled 1049 patients with T2DM hospitalized at Zhongnan Hospital of Wuhan University from January 2015 to May 2021 according to the inclusion and exclusion criteria. Patients with severe missing information (more than 20% of the total) were excluded, and missing values for the remaining patients (less than 20% of the total missing) were filled in by multiple imputation. Patients were divided into a training cohort and a validation cohort using a fixed random seed in R. The inclusion criterion was a diagnosis of T2DM based on the World Health Organization (WHO) guidelines (1999) (14). The main exclusion criteria were as follows: (1) patients with severe infection, acute myocardial infarction, acute cerebral infarction, severe trauma, malignant tumor, surgery, or another stress state; (2) those taking lipid-lowering drugs; (3) those diagnosed with diabetes during pregnancy; (4) those without complete ultrasound measurements of the carotid artery.
The study met the ethical standards of the Declaration of Helsinki (2013) and was approved by the Ethics Committee of Zhongnan Hospital of Wuhan University (ethical approval code: 2022167K). The flow chart of the participants is shown in Figure 1.

Collection of clinical data
The clinical data were obtained from medical records. Personal history (gender, age, height, body weight; smoking or drinking habits; high blood pressure; and family history of diabetes (FHD)), laboratory serological indexes (liver function, renal function, blood lipids, oral glucose tolerance test (OGTT), insulin-releasing test (IRT), glycosylated hemoglobin (HbA1c), and glycosylated serum protein (GSP)), and imaging data (carotid color ultrasonography and abdominal ultrasonography) were obtained.

Evaluation of carotid atherosclerotic disease, OGTT test and IRT test
Color Doppler ultrasound (GE Healthcare, Milwaukee, Wisconsin) was used to diagnose CAD. Experienced sonographers who were blinded to all clinical information performed the evaluation. Carotid intima-media thickness (IMT) and carotid plaque were recorded during the ultrasound examination; IMT was defined as the distance between the leading edges of the first and second echogenic lines of the far arterial wall; the presence of carotid atherosclerotic plaque (CAP) was defined as a focal increase in thickness of at least 0.5 mm or 50% of the surrounding IMT value (15).
Subjects underwent the standard OGTT and IRT the following morning after fasting for 8-10 hours. After fasting venous blood was collected to determine fasting glucose (OGTT0H) and fasting insulin (IRT0H), patients were instructed to ingest 75 g of anhydrous glucose within 5 minutes, and blood samples were collected from the antecubital vein at 30, 60, 120, and 180 minutes to determine plasma glucose concentrations (OGTT0.5H, OGTT1H, OGTT2H, OGTT3H) and serum insulin concentrations (IRT0.5H, IRT1H, IRT2H, IRT3H).

Statistical analysis
The Kolmogorov-Smirnov test was used to test the normality of continuous data. Continuous data that conformed to a normal distribution were expressed as "mean ± standard deviation (x ± s)", and the t-test was used for comparison between two groups; continuous data that did not conform to a normal distribution were expressed as "median (lower quartile, upper quartile)", and the Mann-Whitney U test was used for comparison between groups. Dichotomous variables were expressed as "frequency (proportion)" and compared using the χ² test. The independent risk factors were identified using multivariate logistic regression analyses. For model construction, Least Absolute Shrinkage and Selection Operator (LASSO) regression was first performed using the glmnet package to screen out relevant variables. Then, a multivariate logistic regression analysis was performed on the selected variables. The nomogram was built with the rms and regplot packages. Receiver operating characteristic (ROC) curves, from which sensitivity and specificity were derived, were drawn with the pROC package. The calibration curves were drawn using the rms package. The rmda package was used for decision curve analysis (DCA). SPSS 18.0 (SPSS Inc, Chicago, IL, USA) and R version 4.0.3 (https://www.r-project.org/) were used for statistical analysis. P < 0.05 was considered statistically significant, and P < 0.1 was used to select candidate variables from the univariate analysis.

FIGURE 1
Flowchart of the participants.

Clinical characteristics of the patients with T2DM
Patients were randomly divided into a training cohort (n = 733) and a validation cohort (n = 316) at a ratio of approximately 7:3 using a fixed random seed in R (16). The patients' laboratory examination findings and clinical characteristics are shown in Table 1. Most patients with T2DM were male (63% and 67% in the two cohorts, respectively). The median age of the training cohort was 56 (48, 65) years, and that of the validation cohort was 56.5 (50, 65) years. Except for aspartate aminotransferase (AST), there were no significant differences in characteristics between the two cohorts, indicating that the random grouping did not introduce bias. The results of the univariate analysis are shown in Table 2. The median ages of the no-CAD group (HC) and the CAD group were 53 (43, 60) years and 61 (54, 68) years, respectively. The two groups differed in age, NAFLD, BMI, liver function, renal function, lipids, and blood pressure.
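A seeded random split of this kind can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function name, the seed value, and the use of consecutive integers as patient identifiers are all assumptions made for the example. The text does not specify the sampling routine, so the cohort sizes produced here need not match the reported 733 and 316.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=2021):
    """Shuffle IDs with a fixed seed, then cut them into training and validation sets.

    The seed value and the 0.7 fraction are illustrative assumptions.
    """
    rng = random.Random(seed)          # a fixed seed makes the split reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical patient identifiers 1..1049
train_ids, validation_ids = split_cohort(range(1, 1050))
```

Calling the function again with the same seed returns the same two cohorts, which is the property that setting a seed is meant to guarantee.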

Independent risk factors
Based on univariate analysis (Table 2), we selected candidate variables with P<0.1 for inclusion in the multivariate logistic regression analysis. Independent risk factors associated with CAD in T2DM were finally identified, including age, NAFLD, and OGTT3H (Table 3).

Construction of predictive models
A total of 733 patients in the training cohort were included in the LASSO regression analysis, which retained 13 characteristic variables with non-zero coefficients (Figure 2). Next, to further develop the predictive model for CAD, these variables were entered into a multivariate logistic regression analysis; we selected age, NAFLD, smoking, HDL-C, and LDL-C for model construction (Table 4).
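The way LASSO screening leaves only some variables with non-zero coefficients can be illustrated with a small sketch. The code below is not the glmnet procedure used in the study: it fits an L1-penalised logistic regression by proximal gradient descent in plain Python, and the function names, the toy data, and the penalty values are assumptions chosen for illustration. Choosing the penalty by 10-fold cross-validation is not shown.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def standardize(column):
    """Centre a column and scale it to unit (population) standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column)) or 1.0
    return [(v - mean) / sd for v in column]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def lasso_logistic(columns, y, lam, n_iter=500):
    """L1-penalised logistic regression fitted by proximal gradient descent.

    columns: list of feature columns; y: list of 0/1 outcomes; lam: L1 penalty.
    Returns (intercept, coefficients on the standardised scale).
    """
    xs = [standardize(c) for c in columns]
    n, p = len(y), len(xs)
    step = 4.0 / (p + 1)   # 1 / (upper bound on the Lipschitz constant of the gradient)
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        probs = [sigmoid(intercept + sum(coefs[j] * xs[j][i] for j in range(p)))
                 for i in range(n)]
        resid = [probs[i] - y[i] for i in range(n)]
        intercept -= step * sum(resid) / n           # the intercept is not penalised
        for j in range(p):
            grad = sum(resid[i] * xs[j][i] for i in range(n)) / n
            coefs[j] = soft_threshold(coefs[j] - step * grad, step * lam)
    return intercept, coefs

# Toy data (hypothetical): age is related to the outcome, the second column is noise.
age = [40, 45, 50, 55, 60, 65, 70, 75]
noise = [1, 0, 1, 0, 0, 1, 0, 1]
outcome = [0, 0, 0, 1, 0, 1, 1, 1]

_, unpenalised = lasso_logistic([age, noise], outcome, lam=0.0)
_, heavily_penalised = lasso_logistic([age, noise], outcome, lam=2.0)
```

With no penalty the age coefficient is positive, whereas a sufficiently large penalty drives every coefficient to exactly zero. Intermediate penalties remove the weaker predictors first, which is the behaviour that variable screening relies on.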

Development of the nomogram to predict carotid atherosclerosis
A nomogram for predicting CAD in T2DM was created based on the results of the multivariate logistic regression model (Figure 3). Age carried the highest possible score (100 points, at ≥90 years of age). By visualizing the risk factors for CAD in T2DM, the nomogram can be used to estimate an individual's CAD risk. First, the value of each CAD risk factor is projected upward onto the points scale in the first row to obtain a score for that factor; then, the scores for the five risk factors are summed to obtain a total score. The higher the total score, the higher the CAD risk of the individual.
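The scoring procedure described above can be sketched in code. In a nomogram derived from a logistic regression model, each predictor's points are proportional to its coefficient times its distance from a reference value, rescaled so that the predictor with the widest effect range spans 0 to 100 points. The coefficients, intercept, and axis ranges below are hypothetical values invented for illustration; they are not the fitted model in Table 4, and the function names are likewise assumptions.

```python
import math

# Hypothetical model, for illustration only; NOT the coefficients fitted in the study.
MODEL = {
    "intercept": -3.0,
    "predictors": {
        # name: (coefficient, lowest axis value, highest axis value)
        "age":     (0.06, 20.0, 90.0),
        "nafld":   (0.50, 0.0, 1.0),
        "smoking": (0.40, 0.0, 1.0),
        "hdl_c":   (-0.60, 0.5, 2.5),
        "ldl_c":   (0.30, 1.0, 6.0),
    },
}

def effect_span(coef, low, high):
    """Size of a predictor's effect on the linear predictor across its axis."""
    return abs(coef) * (high - low)

def nomogram_points(model, patient):
    """Points per predictor; the predictor with the widest effect spans 0-100."""
    widest = max(effect_span(*spec) for spec in model["predictors"].values())
    points = {}
    for name, (coef, low, high) in model["predictors"].items():
        reference = low if coef >= 0 else high   # the value that earns 0 points
        points[name] = 100.0 * coef * (patient[name] - reference) / widest
    return points

def predicted_risk(model, patient):
    """Probability from the underlying logistic model."""
    lp = model["intercept"] + sum(
        coef * patient[name] for name, (coef, _, _) in model["predictors"].items()
    )
    return 1.0 / (1.0 + math.exp(-lp))

# Two hypothetical patients
high_risk = {"age": 90, "nafld": 1, "smoking": 1, "hdl_c": 0.5, "ldl_c": 6.0}
low_risk = {"age": 40, "nafld": 0, "smoking": 0, "hdl_c": 1.5, "ldl_c": 2.5}

high_points = nomogram_points(MODEL, high_risk)
low_points = nomogram_points(MODEL, low_risk)
```

Because the total score is a linear rescaling of the model's linear predictor, a higher total always corresponds to a higher predicted probability, which is why the bottom axis of a nomogram can map total points directly to risk.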

Validation of the nomogram to predict carotid atherosclerosis
The ROC curve was used to assess the predictive accuracy of the nomogram. The area under the ROC curve was 0.763 (95% CI = 0.73-0.80) for the training cohort and 0.717 (95% CI = 0.66-0.77) for the validation cohort. The C-indexes of the training and validation cohorts were 0.76 and 0.72, respectively. These results indicate that the nomogram has good predictive performance for CAD (Figure 4). Next, calibration curves were used to evaluate the deviation between the nomogram's predicted probabilities and the observed values. The predictions showed good agreement in both the training and validation cohorts (Figure 5). DCA was then used to assess the clinical usefulness of the nomogram. In the training cohort, using the nomogram to predict CAD risk was more beneficial than the all-intervention or no-intervention strategies when the threshold probability was between 2% and 76%. In the validation cohort, CAD risk prediction with the nomogram was more beneficial than the all-intervention or no-intervention strategies when the threshold probability was between 3% and 80% (Figure 6). These results suggest that the nomogram has good predictive power for CAD.
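The close match between the reported AUC and C-index values is expected: for a binary outcome, the C-index is the same quantity as the area under the ROC curve, namely the probability that a randomly chosen patient with CAD receives a higher predicted risk than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name and the four example predictions are assumptions, and the study itself used the pROC package.

```python
def auc(scores, labels):
    """Probability that a random case outranks a random non-case (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted risks and observed outcomes (1 = CAD, 0 = no CAD)
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
example_auc = auc(example_scores, example_labels)  # 3 of 4 case-control pairs are concordant: 0.75
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of the two groups.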

Discussion
The nomogram is a reliable statistical tool that is widely used in diagnostic prediction models of diabetic complications (13,17). It provides an intuitive graphical representation that makes risk models simple to interpret (18,19). In this study, a CAD risk prediction model for patients with T2DM was developed and presented as a nomogram. The model was validated using the ROC curve, calibration curve, and C index, and the predicted and observed values were found to be in general agreement, indicating that the nomogram prediction model in this study is reliable. DCA showed that the nomogram is clinically applicable. During model construction, five factors (age, fatty liver, smoking, high-density lipoprotein, and low-density lipoprotein) were screened and identified as risk factors for CAD in T2DM. According to the nomogram model, age was the largest risk factor among the five, followed by LDL-C and HDL-C, fatty liver, and smoking history. In addition, age, NAFLD, and OGTT3H were identified as independent risk factors for CAD in T2DM.
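The quantity behind the decision curve analysis mentioned above is the net benefit, which weighs true positives against false positives at a chosen threshold probability. A minimal sketch of the standard formula follows; the function names and example values are assumptions, the convention of treating a risk equal to the threshold as positive is a choice made here, and the study itself used the rmda package.

```python
def net_benefit(risks, labels, threshold):
    """Net benefit of intervening in patients whose predicted risk is >= threshold."""
    n = len(labels)
    true_pos = sum(1 for r, y in zip(risks, labels) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, labels) if r >= threshold and y == 0)
    # A false positive is weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening in every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical predicted risks and observed outcomes (1 = CAD, 0 = no CAD)
example_risks = [0.1, 0.4, 0.35, 0.8]
example_outcomes = [0, 0, 1, 1]

model_nb = net_benefit(example_risks, example_outcomes, threshold=0.5)
treat_all_nb = net_benefit_treat_all(example_outcomes, threshold=0.5)
treat_none_nb = 0.0  # intervening in no one yields zero net benefit by definition
```

A decision curve plots these three quantities over a range of thresholds; the model is considered useful over the thresholds where its net benefit exceeds both the treat-all and treat-none strategies.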
Most risk assessment methods for the development of CAD in T2DM include risk factor analysis. A study conducted in 2020 reported that age, gender, history of hypertension, coronary artery disease, and diabetes are risk factors for carotid atherosclerotic plaque formation (20). The risk factors for CAD in T2DM are similar to common risk factors for stroke in the Chinese population (21). Moreover, advanced age, male sex, lower education, hypertension, diabetes, passive smoking, and high LDL-C levels are independent risk factors for early atherosclerosis (22). Building on the results of these studies, several risk prediction models have been designed to assess the risk of carotid atherosclerosis disease in T2DM. However, the limitations of these studies, such as differences between study populations, have left clinicians without simple and intuitive tools for applying these models, so few of them have been used in clinical practice. In our study, we developed what is, to our knowledge, the first nomogram describing CAD risk factors in T2DM. We performed an internal validation, with areas under the ROC curves of 0.763 and 0.717 for the training and validation cohorts, respectively. The decision curves of the training cohort showed that using this nomogram to predict CAD risk was more favorable than the all-patient intervention or no-intervention scenarios when the threshold probability was between 2% and 76%. The DCA of the validation cohort likewise showed that the nomogram was clinically useful.

FIGURE 2
Clinical index selection using LASSO regression analysis.

The nomogram developed in this study allows direct prediction and visual analysis of the factors that strongly affect the risk of CAD in T2DM. Age is an unavoidable factor in many chronic diseases. According to a Swedish cohort study including 318,083 patients with T2DM, age is strongly associated with cardiovascular and mortality risk in T2DM, which is consistent with the findings of our study (23). A possible mechanism is that the physiological and endocrine functions of the body diminish with age, leading to atherosclerosis; furthermore, older people tend to undergo physiological and structural changes in their blood vessels, which reduce nitric oxide utilization and increase the production of angiotensin (24). Meanwhile, advanced age is a non-modifiable risk factor for diabetes (25). Advanced age leads to the aging of pancreatic β-cells, resulting in defective insulin secretion and decreased glucose sensitivity (26); hyperglycemia is a risk factor for atherosclerosis (27). Persistent hyperglycemia causes changes in most cells in the vascular tissue, which accelerates atherosclerosis (28). Therefore, advanced age further affects blood glucose and aggravates atherosclerosis.
Our study also showed that NAFLD is strongly associated with the development of CAD in patients with T2DM. A study involving 8020 patients showed that persistent NAFLD was associated with an increased risk of developing subclinical CAD (29). NAFLD is also strongly associated with diabetes (30), further contributing to the development of CAD. A possible mechanism is that both NAFLD and T2DM are associated with a state of systemic low-grade inflammation, which may promote atherosclerosis through the secretion of various cytokines such as interleukin-6, interleukin-1, and tumor necrosis factor-α, and of acute phase proteins (C-reactive protein, fibrinogen, and fetuin-A) (31).

FIGURE 3
Nomogram: A nomogram was created to determine the incidence of carotid atherosclerosis. A total score was obtained from the values of each index, and the CAD risk rate corresponding to the total score was the predicted rate of the nomogram.

Smoking remains a major health hazard that significantly affects the morbidity and mortality of cardiovascular disease. It affects all stages of atherosclerosis (32). The increased risk of cardiovascular disease in patients with diabetes is partly associated with a high prevalence of other cardiovascular disease risk factors. Therefore, the management of modifiable cardiovascular disease risk factors can minimize the risk of vascular complications in patients with diabetes (33).
The pathological process leading to atherosclerosis is usually associated with elevated LDL-C concentrations, which alter cell permeability and progressively affect the arterial wall (34). HDL-C may reduce the risk of cardiovascular events (35,36). In addition, a large body of preclinical and mechanistic evidence suggests that HDL has an antidiabetic function and improves glycemic control by increasing insulin sensitivity and β-cell function (37). This is consistent with our study.
Therefore, it was essential to include all five risk predictors in our model. Despite the good performance of our nomogram, the study has some possible limitations. First, it was a retrospective study. The data collected did not provide information on other risk factors for diabetes, such as lifestyle factors, including a chronic high-sugar diet or lack of exercise. Second, only internal validation was used. Third, all patients were from a single center with a limited sample size. Differences in ethnicity were not taken into account, so the sample does not represent the whole population well. Finally, our dataset contained some missing values, which could have introduced selection bias had participants with incomplete records been excluded entirely from model building. Therefore, we used multiple imputation to replace missing values in the analysis. In the future, we hope to extend this work through multicenter collaboration, collecting as many variables as possible to continuously test and refine the predictive model in clinical practice.

FIGURE 5
Calibration curve of nomogram model for predicting carotid atherosclerosis in patients with T2DM.

FIGURE 6
Decision curves of the nomogram model for predicting carotid atherosclerosis in patients with T2DM. The y-axis measures the net benefit. The dotted line represents the CAD risk nomogram. The thin solid line represents the assumption that all patients have CAD. The thick solid line represents the assumption that no patients have CAD. (A) Decision curves for the training cohort show that using this nomogram to predict CAD risk adds more benefit than the all-patient intervention scenario or the no-intervention scenario if the threshold probabilities for patients and physicians are >2% and <76%, respectively. (B) Decision curves for the validation cohort show that using this nomogram to predict CAD risk adds more benefit than the all-patient intervention scenario or the no-intervention scenario if the threshold probabilities for patients and physicians are >3% and <80%, respectively.

In conclusion, we identified three independent risk factors associated with CAD in T2DM: age, NAFLD, and OGTT3H. In addition, using LASSO and logistic regression, we screened five clinical indicators closely associated with the occurrence of CAD in T2DM and visualized them in a nomogram. The nomogram, whose accuracy and clinical usefulness were supported by internal validation, can serve as a clinical tool for clinicians to perform personalized screening.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Ethics statement
The studies involving human participants were approved by the Ethics Committee of the Zhongnan Hospital of Wuhan University (ethical approval code: 2022167K). Written informed consent from the participants was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions
XF wrote the manuscript, performed data analysis, and prepared Figures 1-5; LR and YXi collected the data and prepared Tables 1-2; YXu designed the entire study, provided financial support, and supervised the whole process. All authors contributed to the article and approved the submitted version.